{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "gpuType": "T4"
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "source": [
        "# All about vector quantization\n",
        "\n",
        "txtai supports a number of approximate nearest neighbor (ANN) libraries for vector storage. This includes [Faiss](https://github.com/facebookresearch/faiss), [Hnswlib](https://github.com/nmslib/hnswlib), [Annoy](https://github.com/spotify/annoy), [NumPy](https://github.com/numpy/numpy) and [PyTorch](https://github.com/pytorch/pytorch). Custom implementations can also be added.\n",
        "\n",
        "The default ANN for txtai is Faiss. Faiss has by far the largest array of configurable options in building an ANN index. This article will cover quantization and different approaches that are possible along with the tradeoffs."
      ],
      "metadata": {
        "id": "-0mtX6Faensl"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Install dependencies\n",
        "\n",
        "Install `txtai` and all dependencies."
      ],
      "metadata": {
        "id": "1LYCxMqcje3z"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "aBdRdW-xeh6l"
      },
      "outputs": [],
      "source": [
        "%%capture\n",
        "!pip install git+https://github.com/neuml/txtai pytrec_eval rank-bm25 elasticsearch psutil"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Preparing the datasets\n",
        "\n",
        "First, let's download a subset of the datasets from the BEIR evaluation framework. We'll also retrieve the standard txtai benchmark script. These will be used to help judge the accuracy of quantization methods."
      ],
      "metadata": {
        "id": "4qEigu7Ajo27"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "%%capture\n",
        "import os\n",
        "\n",
        "# Get benchmarks script\n",
        "os.system(\"wget https://raw.githubusercontent.com/neuml/txtai/master/examples/benchmarks.py\")\n",
        "\n",
        "# Create output directory\n",
        "os.makedirs(\"beir\", exist_ok=True)\n",
        "\n",
        "if os.path.exists(\"benchmarks.json\"):\n",
        "  os.remove(\"benchmarks.json\")\n",
        "\n",
        "# Download subset of BEIR datasets\n",
        "datasets = [\"nfcorpus\", \"arguana\", \"scifact\"]\n",
        "for dataset in datasets:\n",
        "  url = f\"https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{dataset}.zip\"\n",
        "  os.system(f\"wget {url}\")\n",
        "  os.system(f\"mv {dataset}.zip beir\")\n",
        "  os.system(f\"unzip -d beir beir/{dataset}.zip\")"
      ],
      "metadata": {
        "id": "uFmK9srYjzPT"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Evaluation\n",
        "\n",
        "Next, we'll setup the scaffolding to run evaluations."
      ],
      "metadata": {
        "id": "McJE_K3lnLTQ"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "import pandas as pd\n",
        "import yaml\n",
        "\n",
        "def writeconfig(dataset, quantize):\n",
        "  sources = {\"arguana\": \"IVF11\", \"nfcorpus\": \"IDMap\", \"scifact\": \"IVF6\"}\n",
        "  config = {\n",
        "    \"embeddings\": {\n",
        "      \"batch\": 8192,\n",
        "      \"encodebatch\": 128,\n",
        "      \"faiss\": {\n",
        "          \"sample\": 0.05\n",
        "      }\n",
        "    }\n",
        "  }\n",
        "\n",
        "  if quantize and quantize[-1].isdigit() and int(quantize[-1]) < 4:\n",
        "    # Use vector quantization for 1, 2 and 3 bit quantization\n",
        "    config[\"embeddings\"][\"quantize\"] = int(quantize[-1])\n",
        "  elif quantize:\n",
        "    # Use Faiss quantization for other forms of quantization\n",
        "    config[\"embeddings\"][\"faiss\"][\"components\"] = f\"{sources[dataset]},{quantize}\"\n",
        "\n",
        "  # Derive name\n",
        "  name = quantize if quantize else \"baseline\"\n",
        "\n",
        "  # Derive config path and write output\n",
        "  path = f\"{dataset}_{name}.yml\"\n",
        "  with open(path, \"w\") as f:\n",
        "    yaml.dump(config, f)\n",
        "\n",
        "  return name, path\n",
        "\n",
        "def benchmarks():\n",
        "  # Read JSON lines data\n",
        "  with open(\"benchmarks.json\") as f:\n",
        "    data = f.read()\n",
        "\n",
        "  df = pd.read_json(data, lines=True).sort_values(by=[\"source\", \"ndcg_cut_10\"], ascending=[True, False])\n",
        "  return df[[\"source\", \"name\", \"ndcg_cut_10\", \"map_cut_10\", \"recall_10\", \"P_10\", \"disk\"]].reset_index(drop=True)\n",
        "\n",
        "# Runs benchmark evaluation\n",
        "def evaluate(quantize=None):\n",
        "  for dataset in datasets:\n",
        "    # Build config based on requested quantization\n",
        "    name, config = writeconfig(dataset, quantize)\n",
        "\n",
        "    command = f\"python benchmarks.py -d beir -s {dataset} -m embeddings -c \\\"{config}\\\" -n \\\"{name}\\\"\"\n",
        "    os.system(command)\n"
      ],
      "metadata": {
        "id": "Olhj91QwmsL-"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Establish a baseline\n",
        "\n",
        "Before introducing vector quantization, let's establish a baseline of accuracy per source without quantization. The following table shows accuracy metrics along with the disk storage size in KB."
      ],
      "metadata": {
        "id": "EfeGhKjvuYoQ"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "evaluate()\n",
        "benchmarks()"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        },
        "id": "i-_eYDf7ryhL",
        "outputId": "f2c22805-5eb3-402d-f5aa-be3117d768f9"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "     source      name  ndcg_cut_10  map_cut_10  recall_10     P_10   disk\n",
              "0   arguana  baseline      0.47886     0.38931    0.76600  0.07660  13416\n",
              "1  nfcorpus  baseline      0.30893     0.10789    0.15315  0.23622   5517\n",
              "2   scifact  baseline      0.65273     0.60386    0.78972  0.08867   7878"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-47ce388a-22a1-4bf0-9a3d-8a249b6d8290\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>source</th>\n",
              "      <th>name</th>\n",
              "      <th>ndcg_cut_10</th>\n",
              "      <th>map_cut_10</th>\n",
              "      <th>recall_10</th>\n",
              "      <th>P_10</th>\n",
              "      <th>disk</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>arguana</td>\n",
              "      <td>baseline</td>\n",
              "      <td>0.47886</td>\n",
              "      <td>0.38931</td>\n",
              "      <td>0.76600</td>\n",
              "      <td>0.07660</td>\n",
              "      <td>13416</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>nfcorpus</td>\n",
              "      <td>baseline</td>\n",
              "      <td>0.30893</td>\n",
              "      <td>0.10789</td>\n",
              "      <td>0.15315</td>\n",
              "      <td>0.23622</td>\n",
              "      <td>5517</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>scifact</td>\n",
              "      <td>baseline</td>\n",
              "      <td>0.65273</td>\n",
              "      <td>0.60386</td>\n",
              "      <td>0.78972</td>\n",
              "      <td>0.08867</td>\n",
              "      <td>7878</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-47ce388a-22a1-4bf0-9a3d-8a249b6d8290')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-47ce388a-22a1-4bf0-9a3d-8a249b6d8290 button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-47ce388a-22a1-4bf0-9a3d-8a249b6d8290');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-fe6bb89a-077e-48c8-b298-9456450f8441\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-fe6bb89a-077e-48c8-b298-9456450f8441')\"\n",
              "            title=\"Suggest charts.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-fe6bb89a-077e-48c8-b298-9456450f8441 button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n"
            ]
          },
          "metadata": {},
          "execution_count": 10
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Quantization\n",
        "\n",
        "The two main types of vector [quantization](https://en.wikipedia.org/wiki/Quantization_(signal_processing)) are scalar quantization and product quantization.\n",
        "\n",
        "Scalar quantization maps floating point data to a series of integers. For example, 8-bit quantization splits the range of floats into 255 buckets. This cuts data storage down by 4 when working with 32-bit floats, since each dimension now only stores 1 byte vs 4. A more dramatic version of this is binary or 1-bit quantization, where the floating point range is cut in half, 0 or 1. The trade-off as one would expect is accuracy.\n",
        "\n",
        "Product quantization is similar in that the process bins a floating point range into codes but it's more complex. This method splits vectors across dimensions into subvectors and runs those subvectors through a clustering algorithm. This can lead to a substantial reduction in data storage at the expense of accuracy like with scalar quantization. The [Faiss documentation](https://github.com/facebookresearch/faiss/wiki#research-foundations-of-faiss) has a number of great papers with more information on this method.\n",
        "\n",
        "Quantization is available at the vector processing and datastore levels in txtai. In both cases, it requires an ANN backend that can support integer vectors. Currently, only Faiss, NumPy and Torch are supported.\n",
        "\n",
        "Let's benchmark a variety of quantization methods."
      ],
      "metadata": {
        "id": "tVv825vJ0uc0"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Evaluate quantization methods\n",
        "for quantize in [\"SQ1\", \"SQ4\", \"SQ8\", \"PQ48x4fs\", \"PQ96x4fs\", \"PQ192x4fs\"]:\n",
        "  evaluate(quantize)\n",
        "\n",
        "# Show benchmarks\n",
        "benchmarks()"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 708
        },
        "id": "s9tB_b9ttYSM",
        "outputId": "16d49e7b-7476-4b6d-d64a-bed3b8627f0a"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "      source       name  ndcg_cut_10  map_cut_10  recall_10     P_10   disk\n",
              "0    arguana   baseline      0.47886     0.38931    0.76600  0.07660  13416\n",
              "1    arguana        SQ8      0.47781     0.38781    0.76671  0.07667   3660\n",
              "2    arguana        SQ4      0.47771     0.38915    0.76174  0.07617   2034\n",
              "3    arguana  PQ192x4fs      0.46322     0.37341    0.75391  0.07539   1260\n",
              "4    arguana   PQ96x4fs      0.43744     0.35052    0.71906  0.07191    844\n",
              "5    arguana        SQ1      0.42604     0.33997    0.70555  0.07055    795\n",
              "6    arguana   PQ48x4fs      0.40220     0.31653    0.67852  0.06785    637\n",
              "7   nfcorpus        SQ4      0.31028     0.10758    0.15417  0.23839    751\n",
              "8   nfcorpus        SQ8      0.30917     0.10810    0.15327  0.23591   1433\n",
              "9   nfcorpus   baseline      0.30893     0.10789    0.15315  0.23622   5517\n",
              "10  nfcorpus  PQ192x4fs      0.30722     0.10678    0.15168  0.23467    433\n",
              "11  nfcorpus   PQ96x4fs      0.29594     0.09929    0.13996  0.22693    262\n",
              "12  nfcorpus        SQ1      0.26582     0.08579    0.12658  0.19907    237\n",
              "13  nfcorpus   PQ48x4fs      0.25874     0.08100    0.11912  0.19567    177\n",
              "14   scifact        SQ4      0.65299     0.60328    0.79139  0.08867   1078\n",
              "15   scifact   baseline      0.65273     0.60386    0.78972  0.08867   7878\n",
              "16   scifact        SQ8      0.65149     0.60193    0.78972  0.08867   2050\n",
              "17   scifact  PQ192x4fs      0.64046     0.58823    0.78933  0.08867    622\n",
              "18   scifact   PQ96x4fs      0.62256     0.57773    0.74861  0.08400    375\n",
              "19   scifact        SQ1      0.58724     0.53418    0.73989  0.08267    338\n",
              "20   scifact   PQ48x4fs      0.52292     0.46611    0.68744  0.07700    251"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-95eadd9f-8a9c-4c3b-a201-5c941a536783\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>source</th>\n",
              "      <th>name</th>\n",
              "      <th>ndcg_cut_10</th>\n",
              "      <th>map_cut_10</th>\n",
              "      <th>recall_10</th>\n",
              "      <th>P_10</th>\n",
              "      <th>disk</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>arguana</td>\n",
              "      <td>baseline</td>\n",
              "      <td>0.47886</td>\n",
              "      <td>0.38931</td>\n",
              "      <td>0.76600</td>\n",
              "      <td>0.07660</td>\n",
              "      <td>13416</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>arguana</td>\n",
              "      <td>SQ8</td>\n",
              "      <td>0.47781</td>\n",
              "      <td>0.38781</td>\n",
              "      <td>0.76671</td>\n",
              "      <td>0.07667</td>\n",
              "      <td>3660</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>arguana</td>\n",
              "      <td>SQ4</td>\n",
              "      <td>0.47771</td>\n",
              "      <td>0.38915</td>\n",
              "      <td>0.76174</td>\n",
              "      <td>0.07617</td>\n",
              "      <td>2034</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>arguana</td>\n",
              "      <td>PQ192x4fs</td>\n",
              "      <td>0.46322</td>\n",
              "      <td>0.37341</td>\n",
              "      <td>0.75391</td>\n",
              "      <td>0.07539</td>\n",
              "      <td>1260</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>arguana</td>\n",
              "      <td>PQ96x4fs</td>\n",
              "      <td>0.43744</td>\n",
              "      <td>0.35052</td>\n",
              "      <td>0.71906</td>\n",
              "      <td>0.07191</td>\n",
              "      <td>844</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>arguana</td>\n",
              "      <td>SQ1</td>\n",
              "      <td>0.42604</td>\n",
              "      <td>0.33997</td>\n",
              "      <td>0.70555</td>\n",
              "      <td>0.07055</td>\n",
              "      <td>795</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>arguana</td>\n",
              "      <td>PQ48x4fs</td>\n",
              "      <td>0.40220</td>\n",
              "      <td>0.31653</td>\n",
              "      <td>0.67852</td>\n",
              "      <td>0.06785</td>\n",
              "      <td>637</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>7</th>\n",
              "      <td>nfcorpus</td>\n",
              "      <td>SQ4</td>\n",
              "      <td>0.31028</td>\n",
              "      <td>0.10758</td>\n",
              "      <td>0.15417</td>\n",
              "      <td>0.23839</td>\n",
              "      <td>751</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8</th>\n",
              "      <td>nfcorpus</td>\n",
              "      <td>SQ8</td>\n",
              "      <td>0.30917</td>\n",
              "      <td>0.10810</td>\n",
              "      <td>0.15327</td>\n",
              "      <td>0.23591</td>\n",
              "      <td>1433</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>9</th>\n",
              "      <td>nfcorpus</td>\n",
              "      <td>baseline</td>\n",
              "      <td>0.30893</td>\n",
              "      <td>0.10789</td>\n",
              "      <td>0.15315</td>\n",
              "      <td>0.23622</td>\n",
              "      <td>5517</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>10</th>\n",
              "      <td>nfcorpus</td>\n",
              "      <td>PQ192x4fs</td>\n",
              "      <td>0.30722</td>\n",
              "      <td>0.10678</td>\n",
              "      <td>0.15168</td>\n",
              "      <td>0.23467</td>\n",
              "      <td>433</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>11</th>\n",
              "      <td>nfcorpus</td>\n",
              "      <td>PQ96x4fs</td>\n",
              "      <td>0.29594</td>\n",
              "      <td>0.09929</td>\n",
              "      <td>0.13996</td>\n",
              "      <td>0.22693</td>\n",
              "      <td>262</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>12</th>\n",
              "      <td>nfcorpus</td>\n",
              "      <td>SQ1</td>\n",
              "      <td>0.26582</td>\n",
              "      <td>0.08579</td>\n",
              "      <td>0.12658</td>\n",
              "      <td>0.19907</td>\n",
              "      <td>237</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>13</th>\n",
              "      <td>nfcorpus</td>\n",
              "      <td>PQ48x4fs</td>\n",
              "      <td>0.25874</td>\n",
              "      <td>0.08100</td>\n",
              "      <td>0.11912</td>\n",
              "      <td>0.19567</td>\n",
              "      <td>177</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>14</th>\n",
              "      <td>scifact</td>\n",
              "      <td>SQ4</td>\n",
              "      <td>0.65299</td>\n",
              "      <td>0.60328</td>\n",
              "      <td>0.79139</td>\n",
              "      <td>0.08867</td>\n",
              "      <td>1078</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>15</th>\n",
              "      <td>scifact</td>\n",
              "      <td>baseline</td>\n",
              "      <td>0.65273</td>\n",
              "      <td>0.60386</td>\n",
              "      <td>0.78972</td>\n",
              "      <td>0.08867</td>\n",
              "      <td>7878</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>16</th>\n",
              "      <td>scifact</td>\n",
              "      <td>SQ8</td>\n",
              "      <td>0.65149</td>\n",
              "      <td>0.60193</td>\n",
              "      <td>0.78972</td>\n",
              "      <td>0.08867</td>\n",
              "      <td>2050</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>17</th>\n",
              "      <td>scifact</td>\n",
              "      <td>PQ192x4fs</td>\n",
              "      <td>0.64046</td>\n",
              "      <td>0.58823</td>\n",
              "      <td>0.78933</td>\n",
              "      <td>0.08867</td>\n",
              "      <td>622</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>18</th>\n",
              "      <td>scifact</td>\n",
              "      <td>PQ96x4fs</td>\n",
              "      <td>0.62256</td>\n",
              "      <td>0.57773</td>\n",
              "      <td>0.74861</td>\n",
              "      <td>0.08400</td>\n",
              "      <td>375</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>19</th>\n",
              "      <td>scifact</td>\n",
              "      <td>SQ1</td>\n",
              "      <td>0.58724</td>\n",
              "      <td>0.53418</td>\n",
              "      <td>0.73989</td>\n",
              "      <td>0.08267</td>\n",
              "      <td>338</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>20</th>\n",
              "      <td>scifact</td>\n",
              "      <td>PQ48x4fs</td>\n",
              "      <td>0.52292</td>\n",
              "      <td>0.46611</td>\n",
              "      <td>0.68744</td>\n",
              "      <td>0.07700</td>\n",
              "      <td>251</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-95eadd9f-8a9c-4c3b-a201-5c941a536783')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-95eadd9f-8a9c-4c3b-a201-5c941a536783 button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-95eadd9f-8a9c-4c3b-a201-5c941a536783');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-6f51c34c-1db4-4060-a2cc-3d1944db1387\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-6f51c34c-1db4-4060-a2cc-3d1944db1387')\"\n",
              "            title=\"Suggest charts.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-6f51c34c-1db4-4060-a2cc-3d1944db1387 button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n"
            ]
          },
          "metadata": {},
          "execution_count": 11
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Review\n",
        "\n",
        "Each of the sources above were run through a series of scalar and product quantization settings. The accuracy vs disk space trade off is clear to see.\n",
        "\n",
        "Couple key points to highlight.\n",
        "\n",
        "- The vector model outputs vectors with 384 dimensions\n",
        "- Scalar quantization (SQ) was evaluated for 1-bit (binary), 4 and 8 bits\n",
        "- 1-bit (binary) quantization stores vectors in [binary indexes](https://github.com/facebookresearch/faiss/wiki/Binary-indexes)\n",
        "- For product quantization (PQ), three methods were tested. 48, 96 and 192 codes respectively, all using 4-bit codes\n",
        "\n",
        "In general, the larger the index size, the better the scores. There are a few exceptions to this but the differences are minimal in those cases. The smaller scalar and product quantization indexes are up to 20 times smaller.\n",
        "\n",
        "It's important to note that the smaller scalar methods typically need a wider number of dimensions to perform competitively. With that being said, even at 384 dimensions, binary quantization still does OK. txtai supports scalar quantization precisions from 1 through 8 bits.\n",
        "\n",
        "This is just a subset of the available quantization methods available in Faiss. More details can be found in the [Faiss documentation](https://github.com/facebookresearch/faiss/wiki/The-index-factory)."
      ],
      "metadata": {
        "id": "2shnHFK17S4u"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Wrapping up\n",
        "\n",
        "This notebook evaluated a variety of vector quantization methods. Quantization is an option to reduce storage costs at the expense of accuracy. Larger vector models (1024+ dimensions) will retain accuracy better with more aggressive quantization methods. As always, results will vary depending on your data."
      ],
      "metadata": {
        "id": "ZrErGwtzFdBC"
      }
    }
  ]
}